Compression scheme for improving cache behavior in database systems

ABSTRACT

The apparatuses and methods described herein may operate to identify, from an index structure stored in memory, a reference minimum bounding shape that encloses at least one minimum bounding shape. Each of the at least one minimum bounding shape may correspond to a data object associated with a leaf node of the index structure. Coordinates of a point of the at least one minimum bounding shape may be associated with a set of first values to produce a relative representation of the at least one minimum bounding shape. The set of first values may be calculated relative to coordinates of a reference point of the reference minimum bounding shape such that each of the set of first values comprises a first number of significant bits fewer than a second number of significant bits representing a second value associated with a corresponding one of absolute coordinates of the point.

CROSS-REFERENCES TO RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 13/360,483 filed Jan. 27, 2013, which is a continuation of U.S. patent application Ser. No. 12/847,475 filed Jul. 30, 2010, now issued as U.S. Pat. No. 8,121,987, which is a continuation of U.S. patent application Ser. No. 11/867,115 filed Oct. 4, 2007, now issued as U.S. Pat. No. 7,797,296, which is a continuation of U.S. application Ser. No. 10/087,360 filed Mar. 1, 2002, now issued as U.S. Pat. No. 7,283,987, and claims the benefit of U.S. Provisional Application Ser. No. 60/272,828, filed Mar. 5, 2001, entitled “COMPRESSION SCHEME FOR IMPROVING INDEX CACHE BEHAVIOR IN MAIN-MEMORY DATABASE,” all of which applications are incorporated herein by reference in their entirety.

BACKGROUND OF THE INVENTION

I. Technical Field

Various embodiments of the invention relate generally to database systems. More particularly, various embodiments of the invention relate to a compression scheme for improving index cache behavior in main-memory database systems.

II. Description of the Related Art

With server DRAM modules priced at less than $2,000/GB, many of the database tables and indexes can now fit in the main memory of modern computer systems. It is predicted that it will be common to have terabytes of main memory for a database within ten years or so.

With such a large amount of memory, the traditional bottleneck of disk access almost disappears, especially for search transactions. Instead, memory access becomes a new bottleneck. A recent study with commercial DBMSs shows that half of the execution time is spent on memory access when the whole database resides in memory.

Since the speed in DRAM chips has been traded off for the capacity, the gap between the CPU speed and the DRAM speed has grown significantly during the past decade. In today's computer systems, each memory access costs tens of processor cycles. To overcome this gap, modern processors adopt up to several megabytes of SRAM as the cache, which can be accessed in just one or two processor cycles.

Recognizing the widening gap between the CPU speed and the DRAM speed, the importance of the cache behavior in the design of main memory indexes was emphasized. It was shown that the cache-conscious search trees (“CSS-trees”) perform lookups much faster than binary search trees and T-trees in the read-only environment. B+-trees and their variants were shown to exhibit a reasonably good cache behavior.

For example, CSB+-trees (“Cache Sensitive B+-trees”) store child nodes contiguously in memory to eliminate most child pointers in the nodes except the first one. The location of the i-th child node is computed from that of the first child. Providing more room for keys in the node, this pointer elimination approach effectively doubles the fanout of a B+-tree. Given the node size in the order of the cache block size, the fanout doubling reduces the height of the B+-tree, which again leads to a smaller number of cache misses during the tree traversal.

Note that such a pointer elimination technique does not provide much benefit in disk-based indexes where the fanout is typically in the order of a few hundreds and doubling the fanout does not lead to an immediate reduction in the tree height.

However, the pointer elimination technique cannot be directly applied to multidimensional index structures such as the R-tree, which have numerous application domains such as spatio-temporal databases, data warehouses, and directory servers. The data objects stored in an R-tree are approximated by so-called minimum bounding rectangles (“MBRs”) in the multidimensional index space, where each MBR is the minimal hyper-rectangle (i.e., a 2-dimensional or higher-dimensional rectangle or box) enclosing the corresponding data object. Those skilled in the art would appreciate that the MBR may be extended to a multi-dimensional shape including boxes or pyramids.

Typically, MBRs are much larger than pointers. Thus, pointer elimination alone cannot widen the index tree to reduce the tree height significantly. For example, when the 16-byte MBR is used for the two-dimensional key, the simple elimination of a 4-byte pointer provides at most 25% more room for the keys, and this increase is not big enough to make any significant difference in the tree height for the improved cache behavior. Therefore, there is a need for a scheme for improving cache behavior in accessing multidimensional indexes in main-memory databases.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A, 1B and 1C are illustrations of the QRMBR technique of the present invention, according to various embodiments.

FIGS. 2A, 2B and 2C are illustrations of the data structure of the CR-tree, according to various embodiments.

FIGS. 3A through 3G are flow charts relating to manipulation of the CR-tree, according to various embodiments.

FIGS. 4A through 4C are illustrations of the data structure changes for three CR-tree variants, according to various embodiments.

FIGS. 5A and 5B are graphs showing the node accesses in R-trees and CR-trees, according to various embodiments.

FIGS. 6A and 6B are graphs showing the cache misses in R-trees and CR-trees.

FIGS. 7A and 7B are graphs showing the increase of optimal node size with query selectivity in 2D R-trees and CR-trees, according to various embodiments.

FIGS. 8A and 8B are graphs showing the false hit ratio by QRMBR size and dimensionality, according to various embodiments.

FIGS. 9A and 9B are graphs showing the search performance of bulk-loaded 2D trees with a uniform data set, according to various embodiments.

FIGS. 10A and 10B are graphs showing the search performance of bulk-loaded 2D trees with a skewed data set, according to various embodiments.

FIGS. 11A and 11B are graphs showing the search time of 2D R-trees and CR-trees with varying cardinality, according to various embodiments.

FIGS. 12A and 12B are graphs showing the update performance on bulk-loaded trees with a uniform data set, according to various embodiments.

FIGS. 13A and 13B are graphs showing the search performance after insertion or deletion, according to various embodiments.

FIGS. 14A, 14B and 14C are graphs showing the ratio of false hits incurred by quantization, according to various embodiments.

FIGS. 15A and 15B are graphs showing the increase of MBR size with varying quantization levels, according to various embodiments.

FIGS. 16A, 16B and 16C are graphs showing the search time with varying quantization levels, according to various embodiments.

FIGS. 17A and 17B are graphs showing the amount of accessed index data, according to various embodiments.

FIGS. 18A and 18B are graphs showing the number of L2 cache misses, according to various embodiments.

FIGS. 19A and 19B are graphs showing the number of key comparisons, according to various embodiments.

FIGS. 20A and 20B are graphs showing the comparison of analytical and experimental results for CR-trees, according to various embodiments.

DETAILED DESCRIPTION

Recognizing that the MBR keys occupy most of the index data in multidimensional indexes such as R-trees, various embodiments of the present invention achieve inexpensive compression of MBR keys to improve the index cache behavior. In one embodiment, for example, a novel tree structure, called “CR-Tree” (Cache-conscious R-Tree) is proposed, where the child nodes are grouped into a parent node so that each node occupies only a small portion of the data space of its parent node. In the CR-tree, an MBR is represented relative to its parent MBR so that the coordinates of the resultant relative MBR have fewer significant bits, with many leading 0's. To further reduce the number of bits per MBR, the CR-tree also cuts off trailing insignificant bits by quantization.

In various embodiments, the analytical results and the experimental results agree, showing that the compression technique can reduce the MBR size to less than one fourth of the uncompressed one, thereby increasing the fanout by more than 150%. A potential problem with the proposed technique is that the information loss by quantization may incur false hits, which have to be filtered out through a subsequent refinement step. However, requiring this refinement step itself is not a problem because it is needed in most multidimensional indexes, and it is possible to keep the number of false hits negligibly small by choosing the quantization level properly such that the cost of filtering out false hits can be paid off by the significant saving in cache misses.

Various embodiments also include several alternative designs of the CR-tree, including whether to use the pointer elimination technique introduced in the CSB+-tree, whether to apply the proposed compression technique to leaf nodes or not, the choice of quantization levels, and the choice of node size. The experimental results show that all the resultant CR-tree variants significantly outperform the R-tree in terms of the search performance and the space requirement. The basic CR-tree that uses only the proposed technique performs search operations faster than the R-tree while performing update operations similarly to the R-tree and using less memory space. Compared with the basic CR-tree, most of the CR-tree variants use less memory space with some algorithmic overhead.

In various embodiments, for example, a reference minimum bounding shape that encloses at least one minimum bounding shape may be identified from an index structure stored in memory. Each of the at least one minimum bounding shape may correspond to a data object associated with a leaf node of the index structure. Coordinates of a point of the at least one minimum bounding shape may be associated with a set of first values to produce a relative representation of the at least one minimum bounding shape. The set of first values may be calculated relative to coordinates of a reference point of the reference minimum bounding shape such that each of the set of first values comprises a first number of significant bits fewer than a second number of significant bits representing a second value associated with a corresponding one of absolute coordinates of the point.

In various embodiments, for example, the relative representation may be compressed using a finite level of quantization to produce a quantized representation of the at least one minimum bounding shape. Also disclosed are methods, systems and non-transitory computer-readable storage devices for accomplishing the same scheme as described above.

Various embodiments may be based on making the R-tree cache-conscious by compressing MBRs. An R-Tree is a height-balanced tree structure designed specifically for indexing multi-dimensional data objects in a database. It stores the minimum bounding rectangle (“MBR”), of two or more dimensions, of a data object as the key in the leaf pages. Various embodiments of the present invention are also applicable to a variant of the R-Tree called the R*-Tree, which improves the search performance by using a better heuristic for redistributing entries and dynamically reorganizing the tree during insertion. Those skilled in the art would appreciate that various embodiments of the present invention are readily applicable to other variants of the R-Tree such as the R+-Tree, the Hilbert R-Tree, or the X-tree.

FIG. 1 illustrates the compression scheme used in various embodiments of the present invention. FIG. 1A shows the absolute coordinates of R0˜R3. FIG. 1B shows the coordinates of R1˜R3 represented relative to the lower left corner of R0. These relative coordinates have fewer significant bits than the absolute coordinates. FIG. 1C shows the coordinates of R1˜R3 quantized into 16 levels or four bits by cutting off trailing insignificant bits. The resultant MBR is called a quantized relative MBR (“QRMBR”). Note that QRMBRs can be slightly larger than the original MBRs.

One example embodiment of the present invention is an index tree, called the CR-tree (“cache-conscious R-tree”), an R-tree variant that uses QRMBRs as index keys. The number of quantization levels may be the same for all the nodes in a CR-tree.

FIG. 2A shows the data structure of a CR-tree node. Each node can contain up to M entries. In addition, it keeps a flag 201 indicating whether it is a leaf node or non-leaf node (internal node), the number of stored entries 202, the reference MBR 203 that tightly encloses all of its child MBRs, and a number of entries such as 204. The reference MBR is used to calculate the QRMBRs stored in the node.

FIG. 2B shows non-leaf nodes (internal nodes) that store entries in the form of (QRMBR 211, ptr 212), where QRMBR 211 is a quantized relative representation of the child node MBR, and ptr 212 is the address of a child node.

FIG. 2C shows leaf nodes that store entries in the form of (QRMBR 221, ptr 222), where QRMBR 221 is a quantized relative representation of the object MBR and ptr 222 refers to a data object. Typically, each of the x and y coordinates is quantized into 256 levels or one byte.

For example, one goal of various embodiments of the present invention is to reduce the index search time in main memory databases, especially using multidimensional indexes. In disk-based indexes, the disk access cost is almost independent of the node size when nodes are moderately sized, but the memory access cost is nearly proportional to the node size. While disk-based indexes are designed such that the number of disk accesses is minimized, main memory indexes need to be designed such that the amount of accessed index data, or c·N_(node access), is minimized, where c denotes the node size in cache blocks and N_(node access) denotes the number of accessed nodes.

In main memory indexes, the search time mainly consists of the key comparison time and the memory access time incurred by cache misses. If a cache miss occurs, the CPU has to wait until the missing data are cached. A cache miss can occur for three reasons: missing data, missing instructions, and missing TLB (translation look-aside buffer) entries, which are needed to map a virtual memory address to a physical address. Therefore, the goal is expressed as minimizing

T_(index search) ≅ T_(key compare) + T_(data cache) + T_(TLB cache)

where T_(key compare) is the time spent comparing keys that are cached, T_(data cache) is the time spent caching data, and T_(TLB cache) is the time spent caching TLB entries. For the purpose of illustration, the caching time for missing instructions is omitted because the number of instruction misses mostly depends on the compiler used and the caching time is hard to control.

Let c be the size of a node in cache blocks, and let N_(node access) be the number of nodes accessed in processing a query. Let C_(key compare) be the key comparison cost per cache block and C_(cache miss) be the cost of replacing a cache block. Let C_(TLB miss) be the cost of handling a single TLB miss. When the size of a node is smaller than that of a memory page, each access to a node incurs at most one TLB miss. For the purpose of illustration, it is assumed that nodes have been allocated randomly and that no node and no TLB entry are cached initially. Then,

T_(index search) = c×C_(key compare)×N_(node access) + c×C_(cache miss)×N_(node access) + C_(TLB miss)×N_(node access) = c×N_(node access)×(C_(key compare) + C_(cache miss) + C_(TLB miss)/c)

Since C_(cache miss) and C_(TLB miss) are constant for a given platform, it is possible to control three parameters: c, C_(key compare), and N_(node access). Among them, C_(key compare) is not expected to be reduced noticeably because the key comparison is generally very simple. In addition, C_(TLB miss) and C_(cache miss) typically have similar values. Therefore, the index search time mostly depends on c·N_(node access).
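As a rough illustration of the cost model above, the following Python sketch evaluates the index search time in both its expanded and its factored form. The function names and the sample cost values are assumptions chosen for illustration only; they are not measurements from any platform.

```python
def index_search_time(c, n_node_access, c_key_compare, c_cache_miss, c_tlb_miss):
    """Expanded form: key comparison + data cache misses + TLB misses."""
    t_key_compare = c * c_key_compare * n_node_access
    t_data_cache = c * c_cache_miss * n_node_access
    t_tlb_cache = c_tlb_miss * n_node_access  # at most one TLB miss per node
    return t_key_compare + t_data_cache + t_tlb_cache


def index_search_time_factored(c, n_node_access, c_key_compare, c_cache_miss, c_tlb_miss):
    """Factored form: c * N * (C_key + C_miss + C_tlb / c)."""
    return c * n_node_access * (c_key_compare + c_cache_miss + c_tlb_miss / c)


# Hypothetical costs in processor cycles (illustrative values only):
# node of 2 cache blocks, 10 node accesses, 5 cycles per block compared,
# 50 cycles per cache miss, 50 cycles per TLB miss.
expanded = index_search_time(2, 10, 5, 50, 50)
factored = index_search_time_factored(2, 10, 5, 50, 50)
```

Both forms give the same total, and because the bracketed factor is nearly constant for a given platform, the product c·N_(node access) dominates the result.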

It is observed that the amount of accessed index data can be best reduced by compressing index entries. c·N_(node access) can be minimized in three ways: changing the node size such that c·N_(node access) becomes minimal, packing more entries into a fixed-size node, and clustering index entries into nodes efficiently. The second is often termed compression and the third clustering.

The optimal node size is equal to the cache block size in the one-dimensional case. In one-dimensional trees like the B+-tree, since exactly one internal node is accessed for each height even for the range query, the number of visited internal nodes decreases logarithmically in the node size. On the other hand, the number of visited leaf nodes decreases linearly with the node size, and c increases linearly with the node size. Therefore, c·N_(node access) increases with the node size, and thus it is minimal when c is one.

In multidimensional indexes, more than one internal node of the same height can be accessed even for the exact match query, and the number of accessed nodes of the same height decreases as the node size increases. Since this decrease is combined with the log-scale decrease of tree height, there is a possibility that the combined decrease rate of node accesses exceeds the linear increase rate of c. It will be shown analytically that the optimal node size depends on several factors like the query selectivity and the cardinality (the number of entries in the index structure).

Compressing index entries is equivalent to increasing the node size without increasing c. In other words, it reduces N_(node access) while keeping c fixed. Thus, it is highly desirable. Compression has been addressed frequently in disk-based indexes because it can reduce the tree height, but there is little dedicated work, especially in multidimensional indexes. The following analysis shows why compression is not important in disk-based indexes but is important in main memory indexes.

Suppose that the tree A can pack f entries on average in a node and the tree B can pack 2f entries in a node using a good compression scheme. Then, their expected heights are log_(f) N and log_(2f) N, respectively. Thus, the height of B is smaller than that of A by a factor of 1+1/log₂ f (=log_(f) N/log_(2f) N). In disk-based indexes, the typical size of a node varies from 4 KB to 64 KB. Assuming that the node size is 8 KB and nodes are 70% full, f is 716 (≅8192×0.7/8) for a B+-tree index and about 286 (≅8192×0.7/20) for a two-dimensional R-tree. Thus, 1/log₂ f is typically around 0.1. On the other hand, the size of a node is small in main memory indexes. With a node occupying two cache blocks or 128 bytes, f is about 11 for a B+-tree and about 4 for a two-dimensional R-tree. Thus, 1/log₂ f is 0.29 for the B+-tree and 0.5 for the R-tree. In summary, node compression can reduce the height of main memory indexes significantly because the size of nodes is small.
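The arithmetic in this comparison can be checked with a short Python sketch. The helper names are assumptions introduced here; the fanout values are the ones quoted in the text.

```python
import math


def expected_height(n, fanout):
    """Expected tree height log_fanout(n) for n entries."""
    return math.log(n) / math.log(fanout)


def height_ratio(f):
    """log_f(N) / log_2f(N), which simplifies to 1 + 1/log2(f) for any N."""
    return 1 + 1 / math.log2(f)


# Relative height gain 1/log2(f) for the fanouts quoted in the text.
gain_disk_btree = 1 / math.log2(716)   # disk-based B+-tree, 8 KB nodes
gain_disk_rtree = 1 / math.log2(286)   # disk-based 2D R-tree, 8 KB nodes
gain_mem_btree = 1 / math.log2(11)     # main memory B+-tree, 128-byte nodes
gain_mem_rtree = 1 / math.log2(4)      # main memory 2D R-tree, 128-byte nodes
```

The gain stays near 0.1 for the disk-based fanouts and rises to roughly 0.29 and 0.5 for the main memory fanouts, which is the contrast the paragraph draws.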

Clustering has been studied extensively in disk-based index structures. In terms of clustering, the B+-tree is optimal in one-dimensional space, but no optimal clustering scheme is known for the multidimensional case. Instead, many heuristic schemes have been studied in various multidimensional index structures. Various embodiments of the present invention may be used with most of these clustering schemes.

MBR Compression

There are two desirable properties for an MBR compression scheme. One is the property of overlap check without decompression. Whether two MBRs overlap or not can be determined directly from the corresponding compressed MBRs, without decompressing them. A basic R-tree operation is to check whether each MBR in a node overlaps a given query rectangle. Therefore, when storing compressed MBRs in a node, this property allows the R-tree operation to be performed by compressing the query rectangle once instead of decompressing all the compressed MBRs in the node.

The other property is simplicity. Compression and decompression should be computationally simple and should be performable using only already cached data. Conventional lossless compression algorithms such as the one used in the GNU gzip program are expensive in terms of both computation and memory access because most of them maintain an entropy-based mapping table and look up the table for compression and decompression. Although they may be useful for disk-based indexes, they are not adequate for main memory indexes.

RMBR Technique

One way to compress is to represent keys relative to a reference MBR within a node. If the coordinates of an MBR are represented relative to the lower left corner of its parent MBR, the resultant relative coordinates have many leading 0's. In the relative representation of MBR (“RMBR”), cutting off these leading 0's makes it possible to effectively reduce the MBR size.

Let P and C be MBRs, which are represented by their lower left and upper right coordinates (xl, yl, xh, yh), and let P enclose C. Then, the relative representation of C with respect to P has the coordinates relative to the lower left corner of P.

RMBR _(P)(C)=(C·xl−P·xl,C·yl−P·yl,C·xh−P·xl,C·yh−P·yl)
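A minimal Python sketch of the relative representation follows. The `MBR` tuple type, the helper names, and the sample coordinates are assumptions made for illustration; integer coordinates are assumed so that significant bits can be counted directly.

```python
from typing import NamedTuple


class MBR(NamedTuple):
    xl: int
    yl: int
    xh: int
    yh: int


def rmbr(parent: MBR, child: MBR) -> MBR:
    """Coordinates of child relative to the lower left corner of parent."""
    return MBR(child.xl - parent.xl, child.yl - parent.yl,
               child.xh - parent.xl, child.yh - parent.yl)


def significant_bits(box: MBR) -> int:
    """Bits needed for the largest coordinate, ignoring leading 0's."""
    return max(v.bit_length() for v in box)


# Hypothetical parent and child MBRs (the parent encloses the child).
P = MBR(1_000_000, 2_000_000, 1_001_000, 2_001_000)
C = MBR(1_000_100, 2_000_200, 1_000_300, 2_000_900)
relative = rmbr(P, C)
```

In this example the absolute coordinates of C need 21 significant bits, while the relative coordinates (100, 200, 300, 900) need only 10.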

However, the following simple analysis shows that the RMBR technique can save only about 32 bits per MBR. For simplicity, it is assumed that the coordinates of MBR are uniformly distributed in their domain and that R-tree nodes of the same height have square-like MBRs roughly of the same size. Without loss of generality, it is assumed that the domain of x coordinates has the unit length and consists of 2³² different values equally spaced.

Let f be the average fanout of leaf nodes, and let N be the total number of data objects. Then, there are roughly N/f leaf nodes, whose MBRs have the area of f/N and the side length of √(f/N) along each axis. Since there are 2³² different values in the unit interval along each axis, there are 2³²·√(f/N) different values in the interval with the length of √(f/N). Therefore, it is possible to save 32−log₂(2³²·√(f/N)) bits, or log₂ √(N/f) bits, for each x coordinate value. When N is one million and f is 11, about 8.2 bits are saved. By multiplying by 4, it is possible to save about 32 bits per MBR. Note that the number of saved bits does not depend on the original number of bits as long as the former is smaller than the latter.

It is possible to easily extend this analysis result such that the number of bits saved is parameterized further by the dimensionality d. The extended result is

$\log_2 \sqrt[d]{N/f} = (\log_2 N - \log_2 f)/d$  (1)

Formula (1) increases logarithmically in N and decreases logarithmically in f, but it is inversely proportional to d. Therefore, the number of saved bits mainly depends on the dimensionality. In one-dimensional space, the relative representation technique can save about 16 bits for each scalar, but it becomes useless as the dimensionality increases.
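Formula (1) can be evaluated directly, as in the following Python sketch. The function name is an assumption; N and f take the values used in the analysis above.

```python
import math


def bits_saved_per_coordinate(n, f, d):
    """Formula (1): (log2 N - log2 f) / d bits saved per coordinate value."""
    return (math.log2(n) - math.log2(f)) / d


# N = one million objects and leaf fanout f = 11, as in the text.
saved_by_dimension = {d: bits_saved_per_coordinate(1_000_000, 11, d)
                      for d in (1, 2, 3, 4, 8)}
```

For d = 2 this gives about 8.2 bits per coordinate, matching the figure above, and the saving shrinks quickly as d grows.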

QRMBR Technique

In addition to the RMBR technique, quantization may be performed as an additional step for further compression. In the quantized RMBR (“QRMBR”) technique, the quantization step cuts off trailing insignificant bits from an RMBR whereas the RMBR technique cuts off leading non-discriminating bits from an MBR. It is shown below that quantizing an RMBR does not affect the correctness of index search, and that its small overhead by quantization is justified by a significant saving in cache misses.

Let I be a reference MBR, and let l be a desired number of quantization levels. Then, the corresponding quantized relative representation of an MBR, C, is defined as

QRMBR_(I,l)(C)=(φ_(I·xl,I·xh,l)(C·xl), φ_(I·yl,I·yh,l)(C·yl), Φ_(I·xl,I·xh,l)(C·xh), Φ_(I·yl,I·yh,l)(C·yh)),

where φ_(a,b,l): R→{0, . . . , l−1} and Φ_(a,b,l): R→{1, . . . , l} are

$\varphi_{a,b,l}(r) = \left\{ \begin{matrix} 0, & \text{if } r \leq a \\ l - 1, & \text{if } r \geq b \\ \left\lfloor l(r - a)/(b - a) \right\rfloor, & \text{otherwise} \end{matrix} \right.$

$\Phi_{a,b,l}(r) = \left\{ \begin{matrix} 1, & \text{if } r \leq a \\ l, & \text{if } r \geq b \\ \left\lceil l(r - a)/(b - a) \right\rceil, & \text{otherwise} \end{matrix} \right.$

The following Lemma says that QRMBR satisfies the first of the two desirable properties. Therefore, the computational overhead of the QRMBR technique is the cost of compressing the query rectangle into a QRMBR for each visited node. In the present implementation, compressing an MBR into a QRMBR consumes at most about 60 instructions, which corresponds to less than 120 ns on a 400 MHz processor because of pipelining. In addition, it incurs no memory access as long as the query MBR and the MBR of the node currently being accessed are cached.

Lemma 1:

Let A and B be MBRs. For any MBR I and integer l, it holds that if QRMBR_(I,l)(A) and QRMBR_(I,l)(B) do not overlap, A and B also do not overlap.

Proof:

The lemma is proved by showing the contrapositive: if A and B overlap, QRMBR_(I,l)(A) and QRMBR_(I,l)(B) overlap. By definition, two rectangles overlap if and only if they share at least one point. Thus, A and B share at least one point. Let (x, y) denote this point. Then, the following holds.

A·xl≦x≦A·xh, A·yl≦y≦A·yh

B·xl≦x≦B·xh, B·yl≦y≦B·yh

For simplicity, the subscripts a, b, and l are omitted from the quantization functions φ and Φ. Since φ and Φ are monotonically non-decreasing functions and φ(r)≦Φ(r) for any r∈R,

φ(A·xl)≦φ(x)≦Φ(x)≦Φ(A·xh), φ(A·yl)≦φ(y)≦Φ(y)≦Φ(A·yh)

φ(B·xl)≦φ(x)≦Φ(x)≦Φ(B·xh), φ(B·yl)≦φ(y)≦Φ(y)≦Φ(B·yh)

Thus, QRMBR_(I,l)(A) and QRMBR_(I,l)(B) share at least the point (φ(x),φ(y)). Hence, they overlap, which completes the proof.

Since it is generally not possible to recover the original coordinates of an MBR from its QRMBR, there is the possibility of incorrectly determining the overlap relationship between two MBRs. However, Lemma 1 guarantees that there is no possibility of saying two actually overlapping MBRs do not overlap. Thus, the QRMBR technique does not miss a data object that satisfies a query.

However, there is still a possibility that two actually non-overlapping MBRs may be determined to overlap. This means that the result of index search may contain false hits that have to be filtered out through a subsequent refinement step. This refinement step is needed for most multidimensional index structures because it is often the case that MBRs are not exact keys of data objects. Thus, requiring the refinement step itself is not an overhead, but the number of false hits can be. The number of false hits can be made negligibly small, such as fewer than one percent, by choosing the quantization level properly.
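Both behaviors can be seen in a small Python sketch: an overlapping pair is never missed, while a disjoint pair that falls into adjacent quantization cells may be reported as a false hit. The function names, the reference MBR, and the sample rectangles are assumptions for illustration, and integer coordinates are assumed.

```python
def quantize(ref, levels, box):
    """QRMBR of box = (xl, yl, xh, yh) relative to the reference MBR ref."""
    def lower(a, b, r):  # floor quantization, clamped to 0 .. levels-1
        if r <= a:
            return 0
        if r >= b:
            return levels - 1
        return (levels * (r - a)) // (b - a)

    def upper(a, b, r):  # ceiling quantization, clamped to 1 .. levels
        if r <= a:
            return 1
        if r >= b:
            return levels
        return -((-levels * (r - a)) // (b - a))

    rxl, ryl, rxh, ryh = ref
    xl, yl, xh, yh = box
    return (lower(rxl, rxh, xl), lower(ryl, ryh, yl),
            upper(rxl, rxh, xh), upper(ryl, ryh, yh))


def overlaps(p, q):
    """Two rectangles overlap if and only if they share at least one point."""
    return p[0] <= q[2] and q[0] <= p[2] and p[1] <= q[3] and q[1] <= p[3]


REF, LEVELS = (0, 0, 160, 160), 16   # hypothetical node MBR, 16 levels
a = (12, 12, 18, 18)
b = (21, 21, 28, 28)                 # disjoint from a
c = (12, 12, 25, 25)                 # overlaps b

# a and b are disjoint, yet their QRMBRs overlap: a false hit.
false_hit = (not overlaps(a, b)) and overlaps(quantize(REF, LEVELS, a),
                                              quantize(REF, LEVELS, b))
# c and b overlap, and so do their QRMBRs, as Lemma 1 requires.
true_hit = overlaps(c, b) and overlaps(quantize(REF, LEVELS, c),
                                       quantize(REF, LEVELS, b))
```

With more quantization levels the cells become smaller, so fewer disjoint pairs share a cell boundary, which is why the false hit ratio can be controlled through the quantization level.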

CR-tree According to Various Embodiments of the Present Invention

FIGS. 3A through 3G show the procedures relating to the CR-tree according to various embodiments of the present invention. The two main differences between the algorithms of the CR-tree and those of conventional R-tree variants are that the CR-tree stores QRMBRs in the nodes and maintains them as its MBR grows or shrinks.

FIG. 3A shows the flow chart of the search procedure, which is similar to those used in other R-tree variants, except that the CR-tree needs to compare a query rectangle to QRMBRs in the nodes. Instead of recovering MBRs from QRMBRs, the CR-tree transforms the query rectangle into the corresponding QRMBR using the MBR of each node as the reference MBR. Then, it compares two QRMBRs to determine whether they overlap.

Search Procedure:

Given a CR-tree and a query rectangle Q, find all index records whose QRMBRs overlap Q.

1. Push the root node to the initially empty stack S (step 301).

2. If S is empty (step 302), return the result set (step 303) and stop (step 304).

3. If S is not empty (step 302), pop a node N from S (step 305) and set R to be QRMBR_(N.MBR,l)(Q) (step 306).

4. If N is not a leaf node (step 307), check each entry E to determine whether E.QRMBR overlaps R. If so, push E.ptr to S (step 308).

5. If N is a leaf node (step 307), check all entries E to determine whether E.QRMBR overlaps R. If so, add E.ptr to the result set (step 309).

6. Repeat from step 2.
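The search steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation of the CR-tree: the `Node` class, the `build_node` helper, and the sample tree are hypothetical, integer coordinates are assumed, and the step numbers in the comments refer to FIG. 3A.

```python
from dataclasses import dataclass, field


def qrmbr(ref, levels, box):
    """QRMBR of box = (xl, yl, xh, yh) relative to the reference MBR ref."""
    def lower(a, b, r):
        if r <= a:
            return 0
        if r >= b:
            return levels - 1
        return (levels * (r - a)) // (b - a)

    def upper(a, b, r):
        if r <= a:
            return 1
        if r >= b:
            return levels
        return -((-levels * (r - a)) // (b - a))

    return (lower(ref[0], ref[2], box[0]), lower(ref[1], ref[3], box[1]),
            upper(ref[0], ref[2], box[2]), upper(ref[1], ref[3], box[3]))


def overlaps(p, q):
    return p[0] <= q[2] and q[0] <= p[2] and p[1] <= q[3] and q[1] <= p[3]


@dataclass
class Node:
    is_leaf: bool
    mbr: tuple                                   # reference MBR of the node
    entries: list = field(default_factory=list)  # (QRMBR, ptr) pairs


def build_node(is_leaf, children, levels):
    """Hypothetical helper: children is a list of (MBR, ptr) pairs."""
    boxes = [box for box, _ in children]
    mbr = (min(b[0] for b in boxes), min(b[1] for b in boxes),
           max(b[2] for b in boxes), max(b[3] for b in boxes))
    return Node(is_leaf, mbr,
                [(qrmbr(mbr, levels, box), ptr) for box, ptr in children])


def search(root, query, levels):
    result, stack = [], [root]                   # step 301
    while stack:                                 # step 302
        node = stack.pop()                       # step 305
        r = qrmbr(node.mbr, levels, query)       # step 306
        for key, ptr in node.entries:            # step 307
            if overlaps(key, r):
                if node.is_leaf:
                    result.append(ptr)           # step 309
                else:
                    stack.append(ptr)            # step 308
    return result                                # step 303


LEVELS = 256
leaf1 = build_node(True, [((0, 0, 10, 10), "a"), ((20, 20, 30, 30), "b")], LEVELS)
leaf2 = build_node(True, [((100, 100, 110, 110), "c"), ((120, 120, 130, 130), "d")], LEVELS)
root = build_node(False, [(leaf1.mbr, leaf1), (leaf2.mbr, leaf2)], LEVELS)
hits = search(root, (5, 5, 8, 8), LEVELS)
```

Note that the query rectangle is quantized once per visited node and the stored QRMBRs are never decompressed. The returned pointers may still include false hits, which a refinement step would filter out.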

FIG. 3B shows the flow chart of the Insert procedure. The Insert procedure inserts a new data object O whose MBR is C into a CR-tree by invoking the ChooseLeaf and Install procedures. The SplitNode and AdjustTree procedures may also be invoked if needed. The Install procedure installs a pair of an MBR C and a data object pointer p in a node N by enlarging N.MBR such that it encloses C and by making an entry of (QRMBR_(N.MBR,l)(C), p) and appending it to N. If N.MBR has been enlarged, the procedure recalculates all the QRMBRs in N by accessing their actual MBRs and invokes the AdjustTree procedure, passing N.

To insert a new data object (step 315), the CR-tree is descended from the root by choosing the child node that needs the least enlargement to enclose the new key of the object MBR. If the node's MBR encloses the new key (step 316), a relative key is calculated for the new entry (step 317). If the node's MBR does not enclose the new key (step 316), the node's MBR must be enlarged to enclose the new key (step 318).

When visiting an internal node to choose one of its children, the object MBR is first transformed into the QRMBR using the node MBR as the reference MBR. Then, the enlargement is calculated between a pair of QRMBRs. Relative keys are calculated for all entries (step 319). When a leaf node is reached, the node MBR is first adjusted such that it encloses the object MBR. Then, an index entry for the data object is created in the node. If the node MBR has been adjusted, the QRMBRs in the node are recalculated because their reference MBR has been changed.

If the node overflows (step 312), it is split (step 313) and the split propagates up the tree.
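The Install step described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `LeafNode` class and the `actual_mbr` callback (which maps a pointer back to its full MBR) are hypothetical, integer coordinates are assumed, and overflow handling and AdjustTree are left to the caller.

```python
def qrmbr(ref, levels, box):
    """QRMBR of box = (xl, yl, xh, yh) relative to the reference MBR ref."""
    def lower(a, b, r):
        if r <= a:
            return 0
        if r >= b:
            return levels - 1
        return (levels * (r - a)) // (b - a)

    def upper(a, b, r):
        if r <= a:
            return 1
        if r >= b:
            return levels
        return -((-levels * (r - a)) // (b - a))

    return (lower(ref[0], ref[2], box[0]), lower(ref[1], ref[3], box[1]),
            upper(ref[0], ref[2], box[2]), upper(ref[1], ref[3], box[3]))


class LeafNode:
    def __init__(self, levels):
        self.levels = levels
        self.mbr = None      # reference MBR; None while the node is empty
        self.entries = []    # (QRMBR, ptr) pairs


def install(node, c, ptr, actual_mbr):
    """Install (c, ptr) in node; return True if node.mbr changed."""
    old = node.mbr
    if old is None:
        node.mbr = c
    else:
        node.mbr = (min(old[0], c[0]), min(old[1], c[1]),
                    max(old[2], c[2]), max(old[3], c[3]))
    enlarged = node.mbr != old
    if enlarged:
        # The reference MBR changed, so every stored QRMBR is recomputed
        # from the actual MBR of the object it points to.
        node.entries = [(qrmbr(node.mbr, node.levels, actual_mbr(p)), p)
                        for _, p in node.entries]
    node.entries.append((qrmbr(node.mbr, node.levels, c), ptr))
    return enlarged          # the caller would invoke AdjustTree when True


# Hypothetical data objects keyed by pointer name.
objects = {"a": (0, 0, 10, 10), "b": (10, 10, 20, 20), "c": (5, 5, 6, 6)}
leaf = LeafNode(levels=16)
flags = [install(leaf, box, name, objects.__getitem__)
         for name, box in objects.items()]
```

Installing "b" enlarges the node MBR and forces the QRMBR of "a" to be recomputed, whereas "c" fits inside the existing MBR and only its own QRMBR is calculated.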

FIG. 3C shows the flow chart of the ChooseLeaf procedure that selects a leaf node in which to place a new MBR C, descending a CR-tree from the root. Starting from the root node, if the selected node is not a leaf node (step 332), the insert key is made relative to the node's MBR (step 335) and a child node is selected that needs the minimum enlargement to enclose the relative insert key (step 336). The process is repeated until the leaf node is reached (step 333).

FIG. 3D shows the flow chart of the SplitNode procedure that splits a node into two based on the linear split algorithm used for an R-tree. The QRMBRs in the nodes need to be recalculated according to their MBR. Splitting can be done using other split algorithms used for the R-tree and its variants, such as the quadratic split algorithm.

The pair of farthest entries is chosen as seeds (step 341). Two nodes are made and each seed is assigned to a node (step 342). Any of the remaining entries is picked (step 344) and assigned to the node that requires the least enlargement of its MBR to include the chosen entry (step 347). If one node has (M-m) entries (step 345), where M is the maximum number of entries in a node and m is a predefined minimum number of entries in a node, all the remaining entries should be assigned to the other node (step 346). This step makes each node have at least m entries. If all the entries are assigned (step 343), the MBR of each node is obtained, and the relative keys in each node are calculated (step 348).

If the node under split is the root, a new root is made and the two split nodes are added to the root as its children (step 349). MBRs and relative keys of the root are recalculated (step 351). If the node under split is not the root and if the parent node is full (step 352), the parent node is split (step 353). If the parent node is not full (step 352) and if the parent node's MBR encloses the new key (step 355), relative keys are recalculated only for the two new entries (step 376). If the parent node's MBR does not enclose the new key (step 355), the parent node's MBR is enlarged to enclose the new key (step 357), the relative keys in the node are recalculated for all entries (step 358), and the tree is adjusted (step 359).

FIG. 3E shows the AdjustTree procedure, which ascends from a leaf node L up to the root, adjusting MBRs of nodes and propagating node splits as necessary. When a node MBR has been adjusted, the QRMBRs in the node are recalculated. First, it is checked whether the enlarged node is the root (step 371). If the enlarged node is not the root and the parent node's MBR encloses the node's MBR (step 373), the parent node's entries are updated and the relative keys are recalculated (step 374). If the parent node's MBR does not enclose the node's MBR (step 373), the parent node's MBR is enlarged to enclose the enlarged node's MBR (step 375). The parent node's entries are updated and all relative keys of the parent node are recalculated (step 377), and the parent node is set to be the enlarged node (step 378).

FIG. 3F shows the flow chart of the Delete procedure, which removes index record E from a CR-tree. The delete key is transformed to be relative to the node's MBR (step 386). If the node is a leaf node (step 387), each entry within the node is compared with the delete object (step 389). If any entry matches the delete object (step 390), the entry is deleted from the leaf node (step 391). If the deleted entry's key touches the leaf node's MBR (step 393), each entry's relative key is recalculated (step 394), and the tree is condensed (step 395).

FIG. 3G shows the flow chart of the CondenseTree procedure. If deleting an entry from a leaf node L leaves the node with too few entries, the CondenseTree procedure eliminates the node and relocates all its entries. Node elimination is propagated upward as necessary. All MBRs of the nodes on the path to the root are adjusted, making them smaller if possible. When a node's MBR has been adjusted, the QRMBRs in the node are recalculated.

First, the parent's entry of a shrunk node is found (step 401). If the entry's key does not touch the parent node's MBR (step 402), the shrunk node's entry in the parent node is updated (step 409) and the procedure stops (step 410). If the entry's key touches the parent node's MBR (step 402), the entry's key and the shrunk node's MBR are compared (step 403). If the parent node cannot be shrunk (step 404), the shrunk node's entry in the parent node is updated (step 405). If the parent node can be shrunk (step 404), the shrunk node's entry in the parent node is likewise updated (step 405), the parent node's MBR is recalculated (step 406), and the relative key of each entry in the parent is also recalculated (step 407). The parent node is then set as the shrunk node (step 408).

Those skilled in the art will appreciate that any of the deletion algorithms used in the R-tree and the R*-tree may also be used with a slight modification.

Bulk Loading

Bulk loading into a CR-tree is no different from that into other R-tree variants. As long as QRMBRs are correctly maintained, existing bottom-up loading algorithms can be used directly.

CR-Tree Variants

FIGS. 4A, 4B and 4C show three variants of the CR-tree according to various embodiments of the present invention, namely, the PE CR-tree, the SE CR-tree, and the FF CR-tree.

FIG. 4A shows the first variation, called the PE (“pointer-eliminated”) CR-tree, which eliminates pointers to child nodes from internal nodes except the pointer of the first entry, similar to the CSB+-tree. Each node includes a field 421 for indicating whether the node is a leaf node or a non-leaf (internal) node, a field 422 for indicating the number of entries in the node, a field 423 for storing a reference MBR, a field 424 for storing a pointer to a child node, and fields such as 425 for storing QRMBRs.

The PE CR-tree widens the CR-tree significantly by eliminating most of the pointers, because the key size of the CR-tree is smaller than that of the R-tree. If the QRMBR size is 4 bytes, this extension doubles the fanout of internal nodes when the pointer is 4 bytes.

It is noted that the pointers to data objects stored in leaf nodes can rarely be eliminated. When the average fanout of both internal and leaf nodes is 10, the number of internal nodes is about one ninth of that of leaf nodes. Therefore, the overall increase of fanout is only about 10%.
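The arithmetic behind the "one ninth" and "about 10%" figures can be checked with a few lines of Python. The sketch below assumes a tree in which every level shrinks by the fanout, so the internal-to-leaf ratio is a geometric series; the function name and the number of summed levels are illustrative choices.

```python
def internal_to_leaf_ratio(fanout: float, levels: int = 50) -> float:
    """Sum 1/f + 1/f^2 + ...: internal nodes per leaf node in a tree of uniform fanout."""
    return sum(fanout ** -h for h in range(1, levels + 1))

ratio = internal_to_leaf_ratio(10)        # converges to 1/(f - 1) = 1/9
internal_share = ratio / (1 + ratio)      # internal nodes as a fraction of all nodes
# Doubling the fanout of internal nodes only (a 100% gain on that share of nodes)
# raises the node-averaged fanout by roughly this percentage.
overall_gain_percent = 100 * internal_share
```

With a fanout of 10, internal nodes make up about one tenth of all nodes, which is why doubling their fanout yields only about a 10% overall gain.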

On the other hand, since the pointer elimination technique works by storing the child nodes with the same parent consecutively, splitting a node becomes expensive. The new node created by a split has to be stored consecutively with its siblings, and this often requires allocating a new space and moving the siblings into the space.

FIG. 4B shows the second variation, called the SE (“space efficient”) CR-tree, which removes the reference MBR from nodes of the basic CR-tree. It makes use of the fact that the reference MBR of each node can be obtained from the matching entry in its parent node.

The figure shows the structure of a node other than the root node. It includes a field 431 for indicating whether the node is a leaf or non-leaf (internal) node, a field 432 for indicating the number of entries in the node, a field 433 for storing a pointer to a child node, and fields such as 434 for storing QRMBRs. Note that the reference MBR is not present in any node except the root node.

The SE CR-tree allows the fanout of internal nodes to increase by four and that of leaf nodes by two when the MBR size is 16 bytes and the QRMBR size is 4 bytes. This increase in fanout could be larger than the increase obtained in the PE CR-tree when the size of a node is as small as one or two cache blocks.

FIG. 4C illustrates the third extension to the basic CR-tree, called the FF (“false-hit free”) CR-tree, which decreases the fanout of leaf nodes compared to the above two extensions that increase the fanout of leaf nodes. Since the QRMBR technique is a lossy compression scheme, the search result can be a superset of the actual answer for a given query. This can be avoided if the QRMBR technique is applied only to internal nodes and original, non-relative MBRs are stored in leaf nodes.
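How a lossy quantization produces a superset can be illustrated with a short Python sketch. The rounding rule used here (lower coordinates rounded down and upper coordinates rounded up to cell boundaries of the reference MBR, so that the quantized rectangle always encloses the original) is an assumption made for this example, and all function names are hypothetical.

```python
import math

def quantize_lower(x, ref_lo, ref_hi, levels):
    """Round a lower coordinate down to a cell index of the reference MBR."""
    if x <= ref_lo:
        return 0
    if x >= ref_hi:
        return levels - 1
    return math.floor(levels * (x - ref_lo) / (ref_hi - ref_lo))

def quantize_upper(x, ref_lo, ref_hi, levels):
    """Round an upper coordinate up to a cell boundary of the reference MBR."""
    if x <= ref_lo:
        return 1
    if x >= ref_hi:
        return levels
    return math.ceil(levels * (x - ref_lo) / (ref_hi - ref_lo))

def qrmbr(mbr, ref, levels):
    xl, yl, xh, yh = mbr
    rxl, ryl, rxh, ryh = ref
    return (quantize_lower(xl, rxl, rxh, levels), quantize_lower(yl, ryl, ryh, levels),
            quantize_upper(xh, rxl, rxh, levels), quantize_upper(yh, ryl, ryh, levels))

def overlaps(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

ref = (0.0, 0.0, 16.0, 16.0)      # reference MBR of the node
obj = (2.2, 2.2, 2.8, 2.8)        # object MBR stored in a leaf
query = (2.9, 2.9, 3.5, 3.5)      # query rectangle that misses the object

exact_hit = overlaps(obj, query)                                    # no real overlap
quantized_hit = overlaps(qrmbr(obj, ref, 16), qrmbr(query, ref, 16))  # cells overlap
```

Here the exact rectangles do not overlap but their quantized forms do, which is a false hit. Storing the original MBR in the leaf, as the FF CR-tree does, makes the leaf-level test exact.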

The figure shows the structure of a leaf node where an object's MBR 441 is stored in the original, non-relative format, together with a pointer to an object 442. The FF CR-tree is useful when the subsequent refinement step is extremely expensive. For example, when complex boundary polygons of administrative districts are indexed by their MBRs, the refinement step of comparing the given query shape with the actual shape of data objects obtained by searching an index can be expensive.

Table 1 shows the space requirements for the index structures according to various embodiments of the present invention, where N is the number of leaf node entries and S is the size of a node in bytes. It is assumed that the size of an MBR is 16 bytes, the size of a QRMBR is 4 bytes, and the size of a pointer is 4 bytes. The typical index sizes are calculated when N is 1,000,000 and S is 128, assuming that the nodes are 70% full. Note that the PE R-tree is an extension of the R-tree obtained by applying the pointer elimination technique. The internal node space is calculated by dividing the leaf space by the average fanout of internal nodes minus one. This analysis shows that the PE CR-tree is not so different from the CR-tree in terms of the space requirement and the PE R-tree is no different from the R-tree.

TABLE 1

              Maximum fanout        Node space                                          Typical
Tree type     Internal   Leaf       Internal                        Leaf                index size
R-tree        m          m          NS/0.7m(0.7m − 1)               NS/0.7m             38.15 MB
PE R-tree     1.25m      m          NS/0.7m(0.875m − 1)             NS/0.7m             35.90 MB
CR-tree       2.5m − 4   2.5m − 4   NS/(1.75m − 2.8)(1.75m − 1.8)   NS/(1.75m − 2.8)    17.68 MB
PE CR-tree    5m − 5     2.5m − 4   NS/(1.75m − 2.8)(3.5m − 2.5)    NS/(1.75m − 2.8)    16.71 MB
SE CR-tree    5m − 1     2.5m − 2   NS/1.75m(3.5m − 0.7)            NS/(1.75m − 1.4)    14.07 MB
FF CR-tree    2.5m − 4   m          NS/0.7m(1.75m − 2.8)            NS/0.7m             32.84 MB
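The typical index sizes can be recomputed from the node-space expressions with a short Python sketch. Two values below are assumptions not stated in the text: m = 6 (a 128-byte node holding six 20-byte R-tree entries, ignoring any header) and 1 MB = 2^20 bytes. Under these assumptions the five rows shown agree with the last column of Table 1 to within rounding.

```python
MIB = 2 ** 20  # assumed meaning of "MB" in the table

def index_size_mb(n: int, s: int, leaf_avg_fanout: float, internal_divisor: float) -> float:
    """Leaf space N*S/fanout plus internal space (leaf space / divisor), in MB."""
    leaf_bytes = n * s / leaf_avg_fanout
    internal_bytes = leaf_bytes / internal_divisor
    return (leaf_bytes + internal_bytes) / MIB

N, S, m = 1_000_000, 128, 6  # m = 6 is an assumption (see lead-in)

# (average leaf fanout, divisor in the internal node space expression)
rows = {
    "R-tree":     (0.7 * m,        0.7 * m - 1),
    "PE R-tree":  (0.7 * m,        0.875 * m - 1),
    "CR-tree":    (1.75 * m - 2.8, 1.75 * m - 1.8),
    "PE CR-tree": (1.75 * m - 2.8, 3.5 * m - 2.5),
    "FF CR-tree": (0.7 * m,        1.75 * m - 2.8),
}

sizes = {name: index_size_mb(N, S, leaf, div) for name, (leaf, div) in rows.items()}
for name, mb in sizes.items():
    print(f"{name:<11} {mb:6.2f} MB")
```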

Analytical Results

Without loss of generality, a data domain of unit hyper-square is assumed. For simplicity, it is assumed that data objects are uniformly distributed in the domain, and the query MBRs are hyper-squares. It is further assumed, as in other analytical work, that the R-tree nodes of the same height have square-like MBRs of roughly the same size. Mathematica 3.0 was used to perform the numerical computation needed to compare the analytical results visually.

Let h denote the height or level of a node, assuming that the height of leaf nodes is one. Let M_(h) denote the number of nodes with the height of h. From the above assumption,

$M_{h} = {\left\lceil \frac{N}{f^{h}} \right\rceil.}$

Let a_(h) denote the average area that a node of height h covers. Then, a_(h) is 1/M_(h). Using the Minkowski sum technique, the probability that a node of height h overlaps a given query rectangle is

$\left( {\sqrt[d]{s} + \sqrt[d]{a_{h}}} \right)^{d},$

where s denotes the size of the query rectangle. Then, the number of height-h nodes that overlap the query rectangle is

$M_{h}\left( {\sqrt[d]{s} + \sqrt[d]{a_{h}}} \right)^{d}\text{ or }\left( {1 + \sqrt[d]{\left\lceil \frac{N}{f^{h}} \right\rceil \cdot s}} \right)^{d}.$

By summing this equation from the leaf to the root, the total number of nodes accessed in R-trees is

$\begin{matrix}{1 + {\sum\limits_{h = 1}^{{\lceil{\log_{f}N}\rceil} - 1}\; \left( {1 + \sqrt[d]{\left\lceil \frac{N}{f^{h}} \right\rceil \cdot s}} \right)^{d}}} & (1)\end{matrix}$

On the other hand, the CR-tree compares QRMBRs in order to determine whether to visit a node or not, while the R-tree compares MBRs. Since a QRMBR is larger than its original MBR by the length of a quantization cell on average, the number of node accesses increases slightly in the CR-tree.

Let l denote the number of quantization levels. Then, each node has l^(d) quantization cells, and the side length of each cell is

$\sqrt[d]{a_{h}}/l$

where h denotes the height of the node. Since whether to visit a child node is determined by comparing the QRMBR of the query rectangle and the stored QRMBR of the child node, the probability to visit a child node is

$\left( {\sqrt[d]{s} + {\sqrt[d]{a_{h}}/l} + \sqrt[d]{a_{h - 1}} + {\sqrt[d]{a_{h}}/l}} \right)^{d}.$

By multiplying by M_(h) and summing from the leaf to the root, the total number of nodes accessed in CR-trees is

$\begin{matrix}{1 + {\sum\limits_{h = 1}^{{\lceil{\log_{f}N}\rceil} - 1}\; \left( {1 + \sqrt[d]{\left\lceil \frac{N}{f^{h}} \right\rceil \cdot s} + {\sqrt[d]{\left\lceil \frac{N}{f^{h + 1}} \right\rceil \cdot s}/l}} \right)^{d}}} & (2)\end{matrix}$
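The two displayed sums, labeled (1) and (2), can be evaluated numerically with the following Python sketch. The function names are hypothetical, and the example fanouts (4.2 for the R-tree and 7.7 for the CR-tree, that is, 70% of 6 and of 11 entries in a 128-byte node) and the 256 quantization levels are illustrative assumptions rather than values taken from the figures.

```python
import math

def tree_height(n: int, f: float) -> int:
    """ceil(log_f N); the sums run from h = 1 to this value minus one."""
    return math.ceil(math.log(n) / math.log(f))

def rtree_node_accesses(n: int, f: float, s: float, d: int = 2) -> float:
    """Equation (1): expected number of R-tree nodes accessed."""
    total = 1.0
    for h in range(1, tree_height(n, f)):
        m_h = math.ceil(n / f ** h)
        total += (1 + (m_h * s) ** (1 / d)) ** d
    return total

def crtree_node_accesses(n: int, f: float, s: float, l: float, d: int = 2) -> float:
    """Equation (2): as (1), plus the quantization term divided by l."""
    total = 1.0
    for h in range(1, tree_height(n, f)):
        m_h = math.ceil(n / f ** h)
        m_next = math.ceil(n / f ** (h + 1))
        total += (1 + (m_h * s) ** (1 / d) + (m_next * s) ** (1 / d) / l) ** d
    return total

n, s = 1_000_000, 1e-4                       # one million objects, 0.01% selectivity
r = rtree_node_accesses(n, f=4.2, s=s)       # assumed average R-tree fanout
cr = crtree_node_accesses(n, f=7.7, s=s, l=256)  # assumed average CR-tree fanout
```

For a fixed fanout, equation (2) is never smaller than equation (1); the CR-tree gains because its smaller keys allow a larger fanout for the same node size.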

FIGS. 5A and 5B plot equations (1) and (2) for a cardinality of one million and a query selectivity of 0.01%. It is assumed that the pointer size is 4 bytes and that each node is 70% full. It is also assumed that the MBR size is 16 bytes in two dimensions (“2D”) and increases linearly with increasing dimension. The QRMBR size is assumed to be one-fourth of the MBR size.

The analytical result shows that the number of accessed nodes decreases as the node size increases. The rate of decrease is initially large, but it becomes smaller as the node size increases. For all the node sizes and all three dimensionalities, the CR-tree outperforms the R-tree by more than a factor of two.

Number of Cache Misses

The number of cache misses can easily be calculated by multiplying equations (1) and (2) by the number of cache misses that a single node access incurs. To obtain the results, equations (1) and (2) were multiplied by S/64, where S is the node size in bytes.

FIGS. 6A and 6B show the calculated number of cache misses for the same configurations as FIGS. 5A and 5B. The analytical results show that as the node size grows, the number of cache misses quickly approaches a minimum, and then increases slowly. In terms of cache misses, the CR-tree outperforms the R-tree significantly, by up to 4.3 times.

FIG. 6A exhibits a sawtooth-like pattern showing the number of cache misses decreasing abruptly at certain node sizes while generally increasing with the node size. Such bumps occur when the height of the tree becomes smaller. For example, the 4D R-tree has the height of 7 when the node size is 448 bytes or 512 bytes, but its height becomes 6 when the node size is 576 bytes. In other words, such bumps occur when the gain due to the decrease of height surpasses the overhead due to the increase of node size.

Although the optimal one-dimensional node size in terms of the number of cache misses is shown to be the cache block size mentioned above, FIGS. 6A and 6B show that this choice of node size is not optimal in multidimensional cases as discussed above.

FIGS. 7A and 7B compare the number of cache misses calculated with query selectivity varying from 0.001% to 1%. It is observed that the optimal node size increases as the query selectivity increases in both R-trees and CR-trees.

FIG. 7A shows that the optimal node size increases in the order of 128B (bytes), 192B, 320B, 640B, and 960B as the query selectivity increases. FIG. 7B shows that the optimal node size increases in the order of 64B, 128B, 128B, 256B, and 320B as the query selectivity increases. The optimal node size increases in the same way as the cardinality and the dimensionality increase.

Ratio of False Hits By Quantization

Each quantization cell of a leaf node has the area of f/(l^(d)N) and the side length of

$\sqrt[d]{f/\left( {l^{d}N} \right)}$

along each axis, and the probability that the QRMBRs of the query MBR and the object MBR overlap is

$\left( {\sqrt[d]{s} + \sqrt[d]{a} + {2\sqrt[d]{f/\left( {l^{d}N} \right)}}} \right)^{d}.$

Therefore, the probability that a false hit occurs is

$\left( {\sqrt[d]{s} + \sqrt[d]{a} + {2\sqrt[d]{f/\left( {l^{d}N} \right)}}} \right)^{d} - \left( {\sqrt[d]{s} + \sqrt[d]{a}} \right)^{d}.$

Dividing by

$\left( {\sqrt[d]{s} + \sqrt[d]{a}} \right)^{d},$

the ratio of false hits incurred by quantization to actual answers is

$\left( {1 + {2{\sqrt[d]{f/\left( {l^{d}N} \right)}/\left( {\sqrt[d]{s} + \sqrt[d]{a}} \right)}}} \right)^{d} - 1.$

FIGS. 8A and 8B plot the false hit ratio derived above when the cardinality is one million and the query selectivity is 0.01%. Here, it is assumed that the pointer size is 4 bytes and that each node is 70% full. FIG. 8A shows the false hit ratio in the 2D CR-tree for three different QRMBR sizes: 2 bytes, 4 bytes, and 8 bytes, and FIG. 8B shows the false hit ratio for three different dimensionalities: 2 dimensions (“2D”), 3D, and 4D. The false hit ratio increases with both the node size and the dimensionality. Using QRMBRs of 4 bytes incurs around one false hit in this configuration, but it saves tens or hundreds of cache misses as shown in FIGS. 6A and 6B.

Experimental Results

To confirm the merits of the CR-tree according to various embodiments of the present invention, a series of experiments were conducted on a SUN UltraSPARC platform (400 MHz CPU with 8 MB L2 cache) running Solaris 2.7.

Six index structures were implemented: the original R-tree, the PE R-tree, the CR-tree, the PE CR-tree, the SE CR-tree, and the FF CR-tree. A bulk-loading algorithm was also implemented. The node size was varied from 64 bytes to 1024 bytes in the implemented index structures. 16-byte MBRs were used and the size of QRMBRs was varied from 2 bytes to 8 bytes. Unless specified otherwise, the default size of QRMBRs is 4 bytes, and the nodes are 70% full.

Two synthetic data sets were generated, each consisting of one million small rectangles located in the unit square. One is uniformly distributed in the unit square while the other has a Gaussian distribution around the center point (0.5, 0.5) with a standard deviation of 0.25. The average side length of the rectangles is set to 0.001.

Search Performance

The search performance of the index trees according to various embodiments of the present invention was compared in terms of the wall-clock time spent processing a two-dimensional region query. 10,000 different query rectangles of the same size were generated, whose center points are uniformly distributed. The size of the query rectangles was varied from 0.01% of the data space to 1%. Since the data space is the unit square, the query selectivity is roughly the same as the size of a query rectangle.

FIGS. 9A and 9B show the measured elapsed time spent searching various indexes bulk-loaded with the uniform data set such that each node is 70% full. As the node size grows, the search time quickly approaches a minimum. After passing the minimum, the search time increases slowly. The minimum moves to the right as the query selectivity increases. This trend holds for all six trees, agreeing with the analytical results.

The CR-tree, the PE CR-tree, and the SE CR-tree form the fastest group. The R-tree and the PE R-tree form the slowest group. The FF CR-tree lies between the two groups.

Although the SE CR-tree is wider than both the CR-tree and the PE CR-tree, it performs worse. This is because the SE CR-tree calculates the reference MBR of a node from the matching entry in its parent node. In the present implementation, this calculation involves about 40 instructions and 16 bytes of memory write.

FIGS. 10A and 10B show the measured elapsed time spent searching indexes bulk-loaded with the skewed data set. There is no noticeable difference from FIGS. 9A and 9B, indicating that all six trees are more or less robust with respect to the skew for any node size.

FIGS. 11A and 11B show that the CR-tree scales well with the cardinality. In this experiment, the size of the query rectangles was set to the inverse of the cardinality such that the number of found data objects is almost the same.

Update Performance

To measure the update performance, 100,000 data objects were inserted into trees bulk-loaded with the 1M uniform data set, and then 100,000 randomly selected data objects were removed from the trees.

FIGS. 12A and 12B show the measured elapsed time per insertion and deletion, respectively. For a given node size, the CR-tree consumes about 15% more time than the R-tree when inserting. However, when the fanouts are the same (for example, the CR-tree with the node size of 256 bytes and the R-tree with the node size of 640 bytes), the CR-tree performs about the same as or better than the R-tree. The reasons are as follows.

When descending a tree for insertion, the child node that needs to be enlarged least is selected. Since the enlargement calculation consumes about 30 instructions in the present implementation, it becomes more expensive than the cache miss in the CR-tree and its variants. Since a single cache block contains about 5.6 QRMBRs in the CR-tree, the enlargement calculation cost is about 168 instructions per cache block, but a cache miss consumes about 80 to 100 processor cycles on a 400 MHz UltraSPARC II. On the other hand, since insertion accesses only one node for each height, the number of accessed nodes decreases logarithmically in the fanout, but the number of enlargement calculations for each node increases linearly with the fanout. Thus, the total number of enlargement calculations increases with the fanout.
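The 5.6 and 168 figures can be reproduced with the arithmetic below. The derivation is an inference rather than something the text states: it assumes a 64-byte cache block, 70% node occupancy, and an 8-byte entry made of a 4-byte QRMBR and a 4-byte pointer; the constant names are hypothetical.

```python
CACHE_BLOCK_BYTES = 64          # assumed cache block size (the S/64 factor used earlier)
FILL_FACTOR = 0.7               # nodes are 70% full
ENTRY_BYTES = 4 + 4             # 4-byte QRMBR plus 4-byte pointer
INSTR_PER_ENLARGEMENT = 30      # stated cost of one enlargement calculation

entries_per_block = CACHE_BLOCK_BYTES * FILL_FACTOR / ENTRY_BYTES
instr_per_block = entries_per_block * INSTR_PER_ENLARGEMENT

print(f"{entries_per_block:.1f} QRMBRs per cache block")
print(f"{instr_per_block:.0f} instructions per cache block")
```

Under these assumptions the computation per cache block (about 168 instructions) exceeds the 80 to 100 cycles of a cache miss, which is why insertion is compute-bound rather than miss-bound in the CR-tree.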

The PE R-tree performs slightly worse than the R-tree because it increases the fanout by less than 25%. Since the fanout of the CR-tree is about 150% larger than that of the R-tree, it performs worse than the R-tree for a given node size. Since the fanout of the PE CR-tree is about 400% larger than that of the R-tree, it performs significantly worse than the R-tree for a given node size. On the other hand, when the fanout is the same, the ranking of the CR-tree is determined by the saving in cache misses and the overhead of updating QRMBRs when the node MBR grows or shrinks.

FIG. 12B shows that the rankings for deletion are slightly different from those for insertion. Deletion is a combination of highly selective search and node update. As was expected from FIGS. 9A and 9B, the CR-tree performs similarly to the R-tree as the query selectivity decreases. On the other hand, node update becomes more expensive as the node size increases because the cost of updating QRMBRs increases. Therefore, the CR-tree outperforms the R-tree when the node size is small, but they cross over as the node size increases.

FIGS. 13A and 13B show the measured search time after the insertion and deletion. This experiment was conducted to check whether insertion and deletion affect the quality of the trees. As shown in FIGS. 9A and 9B, these trees are robust with respect to dynamic insertion and deletion.

Impact of Quantization Levels

To assess the effect of the quantization level, the ratio of false hits incurred by quantization, the quantization error in terms of the MBR size, and the search time were measured for three different quantization levels: 2^(4), 2^(8), and 2^(16). These correspond to QRMBRs of 2 bytes, 4 bytes, and 8 bytes, respectively. The experiment used the trees bulk-loaded with the 1M uniform data set.

FIGS. 14A, 14B and 14C show the ratios of false hits measured for different quantization levels. It was shown above that the false hit ratio can be estimated by $\left(1 + 2\sqrt{f/(l^{2}N)}/\left(\sqrt{s} + \sqrt{a}\right)\right)^{2} - 1$. The false hit ratio increases with the fanout or the size of a node, and decreases with the quantization level and query selectivity. The measured results agree with the analytical results. When quantized into 16 bits, the search result is almost equal to the exact answer for a query. When quantized into 8 bits, the search result contains at most 1% more data objects compared with the exact answer. Quantizing into 4 bits can be used only when the query selectivity is high.

FIGS. 15A and 15B show the increase in the size of MBRs measured for different quantization levels. The above analysis showed that the size of a quantization cell is roughly f/(l²N) for leaf nodes and that a QRMBR extends its original MBR by the cell length along each axis. Thus, the size of a QRMBR increases with the fanout or the node size, and decreases with the quantization level. The measured results agree with the analytical results. When quantized into 16 bits, the size of the MBR increases by less than 0.01%. When quantized into 8 bits, the size of the MBR can increase by 1 to 7% depending on the node size, but this does not lead to the same increase in the size of the search result, as shown in FIGS. 14A, 14B and 14C.

FIGS. 16A, 16B and 16C show the search time measured with varying quantization levels. These figures show that a coarse quantization can increase the search time when the query selectivity is low and the size of a node is large. This is because of a significant number of false hits. In sum, it is possible to quantize into fewer bits as the query selectivity and the cardinality grow, but it is necessary to quantize into more bits as the size of a node grows.

FIGS. 17A and 17B show the amount of accessed index data, which is the number of L2 cache misses when no index data is cached initially, that is, in the worst case of cache misses. In terms of the worst-case cache misses, the six trees are ranked from the best to the worst in the order of the SE CR-tree, the PE CR-tree, the CR-tree, the FF CR-tree, the PE R-tree, and the R-tree. The first three form one group, and the last two form another group as shown in FIGS. 9A and 9B. This measured result also agrees with the analytical results of FIGS. 6A and 6B.

FIGS. 18A and 18B show the number of L2 cache misses measured using the Perfmon tool. The UltraSPARC processors provide two register counters for measuring processor events. The Perfmon tool was used to make these registers count L2 cache misses and to read the values stored in them. The number of L2 cache misses is slightly different from the amount of accessed index data because of cache hits and instruction misses. L2 cache misses caused by instruction misses explain why the number of measured cache misses can be larger than the worst-case cache misses in FIGS. 17A and 17B when both the node size and the query selectivity are small.

It is also observed that the cache hit ratio increases with the node size. This can be explained by the cache replacement policy of processors. Since memory blocks are mapped to cache blocks circularly (for example, by the physical address modulo the cache size), a cached memory block is not replaced by consecutive memory blocks. As the node size increases, the portion of accesses to consecutive memory blocks increases, and thus the cache hit ratio increases accordingly.

FIGS. 19A and 19B show the measured number of key comparisons with varying selectivity. As opposed to the number of cache misses, the QRMBR technique does not reduce the number of key comparisons, but rather increases it slightly. Since the overlap test between two MBRs consumes less than 10 instructions on average in the present implementation, saving an L2 cache miss is worth saving at least 10 overlap tests. The R-tree and the PE R-tree have similar fanouts and form one group. The PE CR-tree and the SE CR-tree also have similar fanouts and form another group.

Concurrency Control

In order to keep the performance improvement by the CR-tree significant, a matching optimization of index concurrency control schemes is needed.

Since the conventional hash-based lock and unlock operation is too expensive for main memory databases, a faster latch and unlatch operation has been proposed. By allocating latch data structures statically and making them directly addressable without a hash, the latch and unlatch operation uses about 20 CISC (IBM 370) instructions, which may correspond to about a hundred RISC instructions. However, the latch and unlatch operation is still very expensive for concurrency control of main memory index trees because the present experiment with the CSB+-tree and the well-known lock coupling technique shows that each node is locked for only about 40 processor cycles.

To prevent locking operations from incurring additional cache misses, the data structures for locking need to be kept within the matching index nodes. To make this possible, the data structure for locking should be as small as possible. For example, the present proposal uses only one byte from each node.

Even if a lock conflict occurs, it will typically be resolved within tens or hundreds of clock cycles. Therefore, spinning for the lock may be employed instead of blocking, which incurs a context switch consuming up to thousands of instructions.

In addition to making a locking operation cheap, it is desirable to reduce the number of locking operations. This is possible by favoring searches in lookup-intensive applications such as directory servers. For example, it is possible to make a search operation lock the entire tree if no update is in progress, or to make a search operation lock nothing by using a versioning technique.

Recovery

Since various embodiments of the present invention reduce the index size by almost 60%, the checkpointing and post-crash restart processes may be accelerated, for example, by reducing the disk access time.

In main memory database systems, the durability of transactions is achieved through logging and occasional checkpointing. Checkpointing in main memory databases is the process of saving a snapshot of the memory-resident database onto a disk. The post-crash restart process consists of loading the latest snapshot and replaying log records generated after the latest checkpointing. Therefore, it is clear that the disk access time during checkpointing and restart decreases as indexes shrink in size. For example, the disk access time for the CSB+-tree decreases by 10% compared with the B+-tree, and the disk access time for the CR-tree decreases to less than half compared with the R-tree.

FIGS. 20A and 20B show that the analytical results agree with the experimental results.

While the invention has been described with reference to various embodiments, it is not intended to be limited to those embodiments. It will be appreciated by those of ordinary skill in the art that many modifications can be made to the structure and form of the described embodiments without departing from the spirit and scope of the invention.

1. (canceled)
2. A method, comprising: identifying, from an index structure stored in memory, a reference minimum bounding shape that encloses a minimum bounding shape, the minimum bounding shape corresponding to a data object associated with a leaf node of the index structure; associating, using one or more processors, coordinates of a point of the minimum bounding shape with a set of first values to produce a relative representation of the minimum bounding shape, the set of first values being calculated relative to coordinates of a reference point of the reference minimum bounding shape; and compressing the relative representation using a finite level of quantization to produce a quantized representation of the minimum bounding shape and storing the compressed relative representation in one or more nodes, the one or more nodes having a node size calculated dynamically based on historical query selectivity, wherein query selectivity is a measure of how much data is expected to be returned by a given query.
3. The method of claim 2, wherein the higher the historical query selectivity, the larger the node size.
4. The method of claim 2, wherein the node size is further calculated dynamically based upon number of entries in the index structure.
5. The method of claim 3, wherein the greater the number of entries in the index structure, the larger the node size.
6. The method of claim 2, wherein the compressing comprises: choosing the finite level of quantization from a set of quantization levels.
7. The method of claim 2, wherein the index structure comprises at least one of an R-tree, an R*-tree, an R+-tree or a Hilbert R-tree.
8. The method of claim 2, further comprising: responsive to a query, searching the index structure using the quantized representation.
9. A system, comprising: memory to store an index structure; and one or more processors configured to execute a compression engine, the compression engine configured to: identify, from an index structure stored in memory, a reference minimum bounding shape that encloses a minimum bounding shape, the minimum bounding shape corresponding to a data object associated with a leaf node of the index structure; associate, using one or more processors, coordinates of a point of the minimum bounding shape with a set of first values to produce a relative representation of the minimum bounding shape, the set of first values being calculated relative to coordinates of a reference point of the reference minimum bounding shape; and compress the relative representation using a finite level of quantization to produce a quantized representation of the minimum bounding shape and store the compressed relative representation in one or more nodes, the one or more nodes having a node size calculated dynamically based on historical query selectivity; wherein query selectivity is a measure of how much data is expected to be returned by a given query.
10. The system of claim 9, wherein the higher the historical query selectivity, the larger the node size.
11. The system of claim 9, wherein the node size is further calculated dynamically based upon number of entries in the index structure.
12. The system of claim 11, wherein the greater the number of entries in the index structure, the larger the node size.
13. The system of claim 9, wherein the compressing comprises: choosing the finite level of quantization from a set of quantization levels.
14. The system of claim 9, wherein the index structure comprises at least one of an R-tree, an R*-tree, an R+-tree or a Hilbert R-tree.
15. The system of claim 9, wherein the compression engine is further configured to: responsive to a query, search the index structure using the quantized representation.
16. A non-transitory computer-readable storage device storing instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: identifying, from an index structure stored in memory, a reference minimum bounding shape that encloses a minimum bounding shape, the minimum bounding shape corresponding to a data object associated with a leaf node of the index structure; associating, using one or more processors, coordinates of a point of the minimum bounding shape with a set of first values to produce a relative representation of the minimum bounding shape, the set of first values being calculated relative to coordinates of a reference point of the reference minimum bounding shape; and compressing the relative representation using a finite level of quantization to produce a quantized representation of the minimum bounding shape and storing the compressed relative representation in one or more nodes, the one or more nodes having a node size calculated dynamically based on historical query selectivity, wherein query selectivity is a measure of how much data is expected to be returned by a given query.
17. The non-transitory computer-readable storage device of claim 16, wherein the higher the historical query selectivity, the larger the node size.
18. The non-transitory computer-readable storage device of claim 16, wherein the node size is further calculated dynamically based upon number of entries in the index structure.
19. The non-transitory computer-readable storage device of claim 18, wherein the greater the number of entries in the index structure, the larger the node size.
20. The non-transitory computer-readable storage device of claim 16, wherein the compressing comprises: choosing the finite level of quantization from a set of quantization levels.
21. The non-transitory computer-readable storage device of claim 16, wherein the index structure comprises at least one of an R-tree, an R*-tree, an R+-tree or a Hilbert R-tree.